HiClus: Highly Scalable Density-based Clustering with Heterogeneous Cloud
Similar resources
HiClus: Highly Scalable Density-based Clustering with Heterogeneous Cloud
Definition 1 (directly density-reachable): A point p is directly density-reachable from a point q with respect to Eps and MinPts if (1) p ∈ NEps(q) and (2) |NEps(q)| ≥ MinPts. Here NEps(q) denotes the Eps-neighborhood of the point q, which contains all points whose distance to q is less than or equal to Eps. From condition (2), it can be understood that there is a cluster if and only if there is a co...
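Definition 1 can be illustrated with a minimal Python sketch. It assumes Euclidean distance and points stored as coordinate tuples; the function names `eps_neighborhood` and `directly_density_reachable` are hypothetical and are not taken from the HiClus paper.

```python
import math


def eps_neighborhood(points, q, eps):
    """Return every point whose distance to q is at most eps (q itself included)."""
    # Assumption: Euclidean distance; the definition only says "distance".
    return [p for p in points if math.dist(p, q) <= eps]


def directly_density_reachable(points, p, q, eps, min_pts):
    """Check both conditions of Definition 1 for p relative to q."""
    neighborhood = eps_neighborhood(points, q, eps)
    in_neighborhood = p in neighborhood        # condition 1: p ∈ NEps(q)
    q_is_dense = len(neighborhood) >= min_pts  # condition 2: |NEps(q)| ≥ MinPts
    return in_neighborhood and q_is_dense


# Hypothetical toy data: three nearby points and one distant point.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0)]
near_result = directly_density_reachable(points, (0.5, 0.0), (0.0, 0.0), eps=1.0, min_pts=3)
far_result = directly_density_reachable(points, (0.0, 0.0), (5.0, 5.0), eps=1.0, min_pts=3)
```

Note that the relation is not symmetric in general: condition 2 is checked only for q, so p can be directly density-reachable from q without the converse holding.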
Scalable Density-Based Distributed Clustering
Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data be located at the site where it is scrutinized. Nowadays, large amounts of heterogeneous, complex data reside on different, independently working computers which are connected to each other via local or wide area networks. In this paper, we propose a scal...
Density-Based Subspace Clustering in Heterogeneous Networks
Many real-world data sets, like data from social media or bibliographic data, can be represented as heterogeneous networks with several vertex types. Often additional attributes are available for the vertices, such as keywords for a paper. Clustering vertices in such networks, and analyzing the complex interactions between clusters of different types, can provide useful insights into the struct...
PAPAyA: A Highly Scalable Cloud-based Framework for Genomic Processing
The PAPAyA platform has been designed to ingest, store and process in silico large genomics datasets using analysis algorithms based on pre-defined knowledge databases with the goal to offer personalized therapy guidance to physicians in particular for cancers and infectious diseases. This new highly scalable, secure and extensible framework is deployed on a cloud-based digital health platform ...
Design and Implementation of Scalable Hierarchical Density Based Clustering
Clustering is a useful technique that divides data points into groups, also known as clusters, such that the data points of the same cluster exhibit similar properties. Typical clustering algorithms assign each data point to at least one cluster. However, in practical datasets such as microarray gene datasets, only a subset of the genes is highly correlated and the dataset is often polluted with a...
Journal
Journal title: Procedia Computer Science
Year: 2015
ISSN: 1877-0509
DOI: 10.1016/j.procs.2015.07.289